Overview

Dataset Statistics

Number of Variables 18
Number of Rows 8760
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 4.9 MB
Average Row Size in Memory 587.2 B
Variable Types
  • Numerical: 8
  • Categorical: 10
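
The two memory figures above appear to be linked by the row count. The sketch below multiplies the rounded average row size by the number of rows; reading "MB" as binary megabytes (MiB) is an assumption, made because it reproduces the reported total.

```python
# Values copied from the Dataset Statistics block (the row size is rounded).
ROWS = 8760
AVG_ROW_BYTES = 587.2

total_bytes = ROWS * AVG_ROW_BYTES
total_mib = total_bytes / 1024**2   # binary megabytes (assumed unit of the report)
total_mb = total_bytes / 1000**2    # decimal megabytes, for comparison

print(f"{total_mib:.1f} MiB vs {total_mb:.1f} MB")
```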

Dataset Insights

rain is skewed
count is skewed
date has high cardinality: 365 distinct values
date has constant length 10
year has constant length 4
dayofweek_n has constant length 1
season has constant length 6
temp has 137 (1.56%) negatives
rain has 7862 (89.75%) zeros
count has 1794 (20.48%) zeros

Variables


rain

numerical

Approximate Distinct Count 48
Approximate Unique (%) 0.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 136.9 KB
Mean 0.06842
Minimum 0
Maximum 10.3
Zeros 7862
Zeros (%) 89.8%
Negatives 0
Negatives (%) 0.0%
  • rain is skewed right (γ1 = 10.0374)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 0
Q3 0
95-th Percentile 0.2
Maximum 10.3
Range 10.3
IQR 0

Descriptive Statistics

Mean 0.06842
Standard Deviation 0.3534
Variance 0.1249
Sum 599.4
Skewness 10.0374
Kurtosis 155.888
Coefficient of Variation 5.1648
  • rain is not normally distributed (p-value 4.34e-25)
  • rain has 898 outliers
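
The outlier count for rain matches the number of non-zero rows (8760 - 7862 = 898). That is what Tukey's 1.5 × IQR fences give when Q1 = Q3 = 0, because both fences collapse to 0. The sketch below assumes the report uses that rule, which the source does not state. The 0.1 rain amount is a placeholder, not a value from the dataset.

```python
import statistics

def count_iqr_outliers(values, k=1.5):
    """Count values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return sum(1 for v in values if v < lower or v > upper)

# Stand-in data: 7862 dry hours plus 898 wet hours (placeholder amount 0.1).
rain = [0.0] * 7862 + [0.1] * 898
n_outliers = count_iqr_outliers(rain)
print(n_outliers)
```

With an IQR of 0, any non-zero reading is flagged, so for this variable the outlier count says more about how often it rains than about extreme values.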

temp

numerical

Approximate Distinct Count 293
Approximate Unique (%) 3.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 136.9 KB
Mean 10.188
Minimum -4.5
Maximum 26.3
Zeros 12
Zeros (%) 0.1%
Negatives 137
Negatives (%) 1.6%
  • temp is skewed right (γ1 = 0.0771)

Quantile Statistics

Minimum -4.5
5-th Percentile 2
Q1 6.6
Median 10
Q3 13.9
95-th Percentile 18.3
Maximum 26.3
Range 30.8
IQR 7.3

Descriptive Statistics

Mean 10.188
Standard Deviation 5.0366
Variance 25.3668
Sum 89246.7
Skewness 0.07705
Kurtosis -0.3422
Coefficient of Variation 0.4944
  • temp is not normally distributed (p-value 2.16e-09)
  • temp has 19 outliers
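
Several of the temp figures can be derived from the others, which makes a quick consistency check possible. The sketch below recomputes the mean, range, IQR, variance and coefficient of variation from the rounded values printed above. Small differences in the last digit are expected because the inputs are already rounded.

```python
# Values copied from the temp section of the report (all rounded).
rows, total = 8760, 89246.7
minimum, maximum = -4.5, 26.3
q1, q3 = 6.6, 13.9
std = 5.0366

mean = total / rows             # Sum / Number of Rows
value_range = maximum - minimum
iqr = q3 - q1
variance = std ** 2             # square of the standard deviation
cv = std / mean                 # coefficient of variation

print(round(mean, 3), round(value_range, 1), round(iqr, 1),
      round(variance, 2), round(cv, 4))
```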

rhum

numerical

Approximate Distinct Count 69
Approximate Unique (%) 0.8%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 136.9 KB
Mean 82.3305
Minimum 24
Maximum 100
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • rhum is skewed left (γ1 = -0.8464)

Quantile Statistics

Minimum 24
5-th Percentile 59
Q1 75
Median 84
Q3 91
95-th Percentile 98
Maximum 100
Range 76
IQR 16

Descriptive Statistics

Mean 82.3305
Standard Deviation 11.6707
Variance 136.2053
Sum 721215
Skewness -0.8464
Kurtosis 0.4797
Coefficient of Variation 0.1418
  • rhum has 120 outliers

wdsp

numerical

Approximate Distinct Count 33
Approximate Unique (%) 0.4%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 136.9 KB
Mean 8.6355
Minimum 1
Maximum 35
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • wdsp is skewed right (γ1 = 1.0078)

Quantile Statistics

Minimum 1
5-th Percentile 3
Q1 5
Median 8
Q3 11
95-th Percentile 17
Maximum 35
Range 34
IQR 6

Descriptive Statistics

Mean 8.6355
Standard Deviation 4.446
Variance 19.7668
Sum 75647
Skewness 1.0078
Kurtosis 1.5549
Coefficient of Variation 0.5148
  • wdsp is not normally distributed (p-value 3.19e-04)
  • wdsp has 144 outliers

date

categorical

Approximate Distinct Count 365
Approximate Unique (%) 4.2%
Missing 0
Missing (%) 0.0%
Memory Size 641.6 KB

Length

Mean 10
Standard Deviation 0
Median 10
Minimum 10
Maximum 10

Sample

1st row 2021-03-01
2nd row 2021-03-01
3rd row 2021-03-01
4th row 2021-03-01
5th row 2021-03-01

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 17520
Decimal Number 70080
  • date has words of constant length
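
The labels in the Letter block correspond to Unicode general categories (Lowercase Letter = Ll, Uppercase Letter = Lu, Space Separator = Zs, Dash Punctuation = Pd, Decimal Number = Nd). A minimal sketch of such a count follows. It assumes every value has the `YYYY-MM-DD` shape shown in the sample rows, and the helper name is hypothetical.

```python
import unicodedata
from collections import Counter

def char_category_counts(values):
    """Tally characters across all strings by Unicode general category."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# Placeholder column: 8760 strings with the same shape as the real dates.
dates = ["2021-03-01"] * 8760
counts = char_category_counts(dates)
print(counts["Pd"], counts["Nd"])  # dashes, digits
```

Each date contributes two dashes and eight digits, so 8760 rows give the 17520 and 70080 reported above.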

hour

numerical

Approximate Distinct Count 24
Approximate Unique (%) 0.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 136.9 KB
Mean 11.5
Minimum 0
Maximum 23
Zeros 365
Zeros (%) 4.2%
Negatives 0
Negatives (%) 0.0%

Quantile Statistics

Minimum 0
5-th Percentile 1
Q1 5.75
Median 11.5
Q3 17.25
95-th Percentile 22
Maximum 23
Range 23
IQR 11.5

Descriptive Statistics

Mean 11.5
Standard Deviation 6.9226
Variance 47.9221
Sum 100740
Skewness 0
Kurtosis -1.2042
Coefficient of Variation 0.602
  • hour is not normally distributed (p-value 8.53e-198)
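
The hour figures are what one would expect from 365 complete days of 24 hourly rows. The sketch below rebuilds such a column and recomputes the statistics. The estimator choices are assumptions made because they reproduce the reported numbers: sample variance (n - 1), linearly interpolated quantiles, and moment-based excess kurtosis.

```python
import statistics

# 365 occurrences of each hour 0..23, i.e. 8760 rows.
hours = [float(h) for h in range(24) for _ in range(365)]

n = len(hours)
mean = statistics.fmean(hours)
variance = statistics.variance(hours)   # sample variance (divides by n - 1)
std = statistics.stdev(hours)
q1, median, q3 = statistics.quantiles(hours, n=4, method="inclusive")

# Excess kurtosis from central moments (population form).
m2 = sum((h - mean) ** 2 for h in hours) / n
m4 = sum((h - mean) ** 4 for h in hours) / n
excess_kurtosis = m4 / m2 ** 2 - 3

print(mean, round(variance, 4), round(std, 4), q1, median, q3,
      round(excess_kurtosis, 4))
```

A flat distribution like this one has negative excess kurtosis, so the normality test rejects it decisively even though the skewness is exactly 0.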

day

numerical

Approximate Distinct Count 31
Approximate Unique (%) 0.4%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 136.9 KB
Mean 15.7205
Minimum 1
Maximum 31
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • day is skewed right (γ1 = 0.0075)

Quantile Statistics

Minimum 1
5-th Percentile 2
Q1 8
Median 16
Q3 23
95-th Percentile 29
Maximum 31
Range 30
IQR 15

Descriptive Statistics

Mean 15.7205
Standard Deviation 8.7967
Variance 77.3828
Sum 137712
Skewness 0.007521
Kurtosis -1.1932
Coefficient of Variation 0.5596
  • day is not normally distributed (p-value 1.42e-65)

month

numerical

Approximate Distinct Count 12
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 136.9 KB
Mean 6.526
Minimum 1
Maximum 12
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • month is skewed left (γ1 = -0.0105)

Quantile Statistics

Minimum 1
5-th Percentile 1
Q1 4
Median 7
Q3 10
95-th Percentile 12
Maximum 12
Range 11
IQR 6

Descriptive Statistics

Mean 6.526
Standard Deviation 3.448
Variance 11.889
Sum 57168
Skewness -0.01046
Kurtosis -1.2071
Coefficient of Variation 0.5284
  • month is not normally distributed (p-value 3.48e-03)

year

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 590.3 KB
  • The largest value (2021) is over 5.19 times larger than the second largest value (2022)

Length

Mean 4
Standard Deviation 0
Median 4
Minimum 4
Maximum 4

Sample

1st row 2021
2nd row 2021
3rd row 2021
4th row 2021
5th row 2021

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 35040
  • The top 2 categories (2021, 2022) take over 50.0%
  • By word frequency, the most common word (2021) is over 5.19 times as frequent as the second most common (2022)
  • year has words of constant length
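
The 5.19 ratio between 2021 and 2022 follows from where the year boundary falls in the data. The sketch below assumes the rows form an unbroken hourly series starting at 2021-03-01 00:00, an assumption based on the first sample row, the 365 distinct dates and the 8760 rows. It then counts rows per year.

```python
from collections import Counter
from datetime import datetime, timedelta

start = datetime(2021, 3, 1)  # first sampled date in the report
timestamps = [start + timedelta(hours=i) for i in range(8760)]  # assumes no gaps

year_counts = Counter(ts.year for ts in timestamps)
(top_year, top_n), (second_year, second_n) = year_counts.most_common(2)
ratio = top_n / second_n
distinct_dates = len({ts.date() for ts in timestamps})

print(top_year, top_n, second_year, second_n, round(ratio, 2), distinct_dates)
```

Under that assumption 2021 covers 306 days and 2022 covers 59, so the imbalance comes from the sampling window and does not indicate a data-quality problem.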

holiday

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 598.6 KB
  • The largest value (False) is over 44.62 times larger than the second largest value (True)

Length

Mean 4.9781
Standard Deviation 0.1464
Median 5
Minimum 4
Maximum 5

Sample

1st row False
2nd row False
3rd row False
4th row False
5th row False

Letter

Count 43608
Lowercase Letter 34848
Space Separator 0
Uppercase Letter 8760
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (False, True) take over 50.0%
  • By word frequency, the most common word (false) is over 44.62 times as frequent as the second most common (true)

dayofweek_n

categorical

Approximate Distinct Count 7
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 564.6 KB

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 8760
  • dayofweek_n has words of constant length

dayofweek

categorical

Approximate Distinct Count 7
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 617.1 KB

Length

Mean 7.1397
Standard Deviation 1.125
Median 7
Minimum 6
Maximum 9

Sample

1st row Monday
2nd row Monday
3rd row Monday
4th row Monday
5th row Monday

Letter

Count 62544
Lowercase Letter 53784
Space Separator 0
Uppercase Letter 8760
Dash Punctuation 0
Decimal Number 0

working_day

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 592.9 KB
  • The largest value (True) is over 2.32 times larger than the second largest value (False)

Length

Mean 4.3014
Standard Deviation 0.4589
Median 4
Minimum 4
Maximum 5

Sample

1st row True
2nd row True
3rd row True
4th row True
5th row True

Letter

Count 37680
Lowercase Letter 28920
Space Separator 0
Uppercase Letter 8760
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (True, False) take over 50.0%
  • By word frequency, the most common word (true) is over 2.32 times as frequent as the second most common (false)

season

categorical

Approximate Distinct Count 4
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 607.4 KB

Length

Mean 6
Standard Deviation 0
Median 6
Minimum 6
Maximum 6

Sample

1st row Winter
2nd row Winter
3rd row Winter
4th row Winter
5th row Winter

Letter

Count 52560
Lowercase Letter 43800
Space Separator 0
Uppercase Letter 8760
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Summer, Spring) take over 50.0%
  • season has words of constant length

peak

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 596.3 KB
  • The largest value (False) is over 2.44 times larger than the second largest value (True)

Length

Mean 4.7089
Standard Deviation 0.4543
Median 5
Minimum 4
Maximum 5

Sample

1st row False
2nd row False
3rd row False
4th row False
5th row False

Letter

Count 41250
Lowercase Letter 32490
Space Separator 0
Uppercase Letter 8760
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (False, True) take over 50.0%
  • By word frequency, the most common word (false) is over 2.44 times as frequent as the second most common (true)

timesofday

categorical

Approximate Distinct Count 4
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 614.5 KB

Length

Mean 6.8333
Standard Deviation 1.5185
Median 7
Minimum 5
Maximum 9

Sample

1st row Night
2nd row Night
3rd row Night
4th row Night
5th row Night

Letter

Count 59860
Lowercase Letter 51100
Space Separator 0
Uppercase Letter 8760
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Night, Afternoon) take over 50.0%

rain_type

categorical

Approximate Distinct Count 5
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 618.1 KB
  • The largest value (no rain) is over 16.91 times larger than the second largest value (drizzle)

Length

Mean 7.2579
Standard Deviation 1.1683
Median 7
Minimum 7
Maximum 13

Sample

1st row no rain
2nd row no rain
3rd row no rain
4th row no rain
5th row no rain

Letter

Count 55284
Lowercase Letter 55284
Space Separator 8295
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (no rain, drizzle) take over 50.0%
  • By word frequency, the most common word (rain) is over 17.84 times as frequent as the second most common (drizzle)
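
The two ratios for rain_type differ because one compares whole labels and the other compares individual words. One reading reproduces both numbers from figures elsewhere in the report. It rests on three assumptions that the report does not confirm: "drizzle" is the only label without a space, every other label contains exactly one space and the word "rain", and the "no rain" rows are exactly the rows where rain is 0.

```python
rows = 8760
space_chars = 8295      # "Space Separator" count for rain_type
zero_rain_rows = 7862   # "Zeros" count for rain, assumed equal to "no rain" rows

# Labels without a space (assumed to be "drizzle" only).
drizzle_rows = rows - space_chars

value_ratio = zero_rain_rows / drizzle_rows   # label level: "no rain" vs "drizzle"
word_ratio = space_chars / drizzle_rows       # word level: "rain" vs "drizzle"

print(drizzle_rows, round(value_ratio, 2), round(word_ratio, 2))
```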

count

numerical

Approximate Distinct Count 25
Approximate Unique (%) 0.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 136.9 KB
Mean 3.7807
Minimum 0
Maximum 26
Zeros 1794
Zeros (%) 20.5%
Negatives 0
Negatives (%) 0.0%
  • count is skewed right (γ1 = 1.1671)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 1
Median 3
Q3 6
95-th Percentile 11
Maximum 26
Range 26
IQR 5

Descriptive Statistics

Mean 3.7807
Standard Deviation 3.6198
Variance 13.1028
Sum 33119
Skewness 1.1671
Kurtosis 1.3939
Coefficient of Variation 0.9574
  • count is not normally distributed (p-value 1.52e-10)
  • count has 143 outliers
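
The γ1 values quoted throughout are skewness coefficients. A minimal sketch of the moment-based definition (third central moment divided by the 1.5 power of the second) is given below. Whether the report uses this form or a bias-corrected one is not stated, and the sample lists are hypothetical, not taken from the dataset.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]
long_right_tail = [0, 0, 1, 1, 2, 3, 10]  # hypothetical counts, not report data

print(skewness(symmetric), skewness(long_right_tail) > 0)
```

A positive value, as reported for count and rain, indicates a long right tail: most rows are small with occasional large values.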

Interactions

Correlations

Missing Values